cSPADE-UE: Algorithm for Sequence Mining for Unstructured Elements Using Time Gap Constraints

Author

  • Pinkal Shah
Abstract

We present a new state machine that combines two techniques for complex data sequences: data modeling and frequent sequence mining. The algorithm relies on an unstructured variable-gap sequence miner to mine frequent patterns with different gaps between elements. It has two variations: a sequence pruning technique, applied against other primary frequent sequences, to reduce space complexity; and a relaxed matching that allows the same sequence to be formed even when the sequences do not match at every position. We apply the algorithm to the task of protein sequence classification on real data from protein families. Compared with a state-of-the-art method for protein classification, it decreases the state space complexity and improves accuracy.
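As a rough illustration of what mining frequent patterns with different gaps between elements can look like, the Python sketch below counts ordered element pairs that occur within a bounded gap and keeps those found in enough sequences. It is not the cSPADE-UE algorithm itself: the function name `mine_gapped_pairs` and the parameters `min_support` and `max_gap` are assumptions introduced only for this example, and the pruning and relaxed-matching variations are not modeled.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Set, Tuple


def mine_gapped_pairs(
    sequences: Iterable[Sequence[str]],
    min_support: int,
    max_gap: int,
) -> Dict[Tuple[str, str], int]:
    """Count pairs (a, b) where b follows a with at most `max_gap`
    elements in between (hypothetical helper, for illustration only).

    Support is the number of input sequences that contain the pair at
    least once; only pairs with support >= min_support are returned.
    """
    # Map each pair to the set of sequence ids in which it occurs.
    occurrences: Dict[Tuple[str, str], Set[int]] = defaultdict(set)

    for sid, seq in enumerate(sequences):
        for i, a in enumerate(seq):
            # gap = number of elements skipped between a and b.
            for gap in range(max_gap + 1):
                j = i + gap + 1
                if j >= len(seq):
                    break
                occurrences[(a, seq[j])].add(sid)

    return {
        pair: len(sids)
        for pair, sids in occurrences.items()
        if len(sids) >= min_support
    }


if __name__ == "__main__":
    toy = ["ABC", "AXC", "AC"]
    # A is followed by C within a gap of at most one element in all three.
    print(mine_gapped_pairs(toy, min_support=2, max_gap=1))
```

On the toy input, only the pair `('A', 'C')` reaches the support threshold, because the elements between `A` and `C` differ from sequence to sequence while the pair itself recurs.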

Similar resources

Algorithm for sequence mining using gap constraints

The sequence mining problem consists in finding frequent sequential patterns in a database of timestamped events. Some application domains require limiting the maximum temporal gap between events in the input sequences. However, concentrating on such a constraint is critical for most sequence mining algorithms. In this paper we describe CCSM (Cache-based Constrained Sequence Miner), a new level-w...

Adaptive Unstructured Grid Generation Scheme for Solution of the Heat Equation

An adaptive unstructured grid generation scheme is introduced to use finite volume (FV) and finite element (FE) formulation to solve the heat equation with singular boundary conditions. Regular grids could not achieve an accurate solution to this problem. The grid generation scheme uses an optimal time complexity frontal method for the automatic generation and Delaunay triangulation of the grid po...

An Efficient Algorithm for Mining Frequent Sequence with Constraint Programming

The main advantage of Constraint Programming (CP) approaches for sequential pattern mining (SPM) is their modularity, which includes the ability to add new constraints (regular expressions, length restrictions, etc). The current best CP approach for SPM uses a global constraint (module) that computes the projected database and enforces the minimum frequency; it does this with a filtering algori...

NOSEP: Nonoverlapping Sequence Pattern Mining With Gap Constraints.

Sequence pattern mining aims to discover frequent subsequences as patterns in a single sequence or a sequence database. By combining gap constraints (or flexible wildcards), users can specify special characteristics of the patterns and discover meaningful subsequences suitable for their own application domains, such as finding gene transcription sites from DNA sequences or discovering patterns ...

Estimation of geochemical elements using a hybrid neural network-Gustafson-Kessel algorithm

Bearing in mind that lack of data is a common problem in the study of porphyry copper mining exploration, our goal was to identify the hidden patterns within the data and to extend the information to the data-less areas. To do this, a combination of pattern recognition techniques has been used. In this work, a multi-layer neural network was used to estimate the concentration of geochemical ...

Publication date: 2014